The IIoT Stack for Renewable Energy Sites — A PM’s Field Guide
If you are a product manager working on software for renewable energy sites — solar farms, battery energy storage systems (BESS), or hybrid microgrids — you will quickly realise that your product does not live in isolation. It sits at the top of a long chain of hardware, protocols, gateways, and infrastructure, each layer making decisions that constrain what you can promise your users.
This is a field guide for PMs who want to understand that chain. Not at an engineering depth, but deep enough to write meaningful specs, ask the right questions in technical discussions, and avoid committing to features the architecture cannot support.
The Stack at a Glance
The IIoT stack for a renewable energy site has five layers, each with its own role, technology choices, and product implications:
- Field Layer — Physical devices: inverters, BMS, meters, sensors
- Edge Gateway Layer — Protocol translation and local processing
- Communication Layer — How data moves from site to cloud
- Cloud/Backend Layer — Storage, processing, and business logic
- Application Layer — APIs, dashboards, analytics, and user-facing features
Let us walk through each one.
Layer 1: The Field Layer
This is where energy is actually generated, stored, and measured. A typical renewable energy site will have:
- Inverters (solar, battery-side, or hybrid) that convert DC power to AC and report operational data
- Battery Management Systems (BMS) that monitor cell voltage, temperature, state of charge (SoC), and state of health (SoH)
- Revenue-grade energy meters that measure import/export at the grid connection point
- Environmental sensors for irradiance, temperature, and wind
- Protection relays and SCADA-connected switchgear
These devices speak their own languages. The dominant protocols at this layer are:
- Modbus RTU (serial, RS-485) — the oldest and most widespread; found on almost every inverter and meter
- Modbus TCP — Modbus over Ethernet; more common on newer equipment
- DNP3 — widely used in grid-connected applications and by utilities; supports unsolicited reporting and time-stamped data
- CAN bus — common inside battery packs, where the BMS controller communicates with its cell-monitoring modules and often with the inverter
- IEC 61850 — a modern standard for substation automation; increasingly adopted in large BESS projects
- SunSpec — a Modbus-based standard specifically for solar and storage equipment
PM implication: When your team says “we support all inverter brands,” ask which protocols they actually implement. A feature like real-time SoC display is trivial if the BMS speaks Modbus TCP over a clean Ethernet link. It becomes a significant engineering effort if the BMS uses a proprietary CANbus variant and you need a custom driver. Protocol support is a backlog item, not a given.
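To make this concrete, here is a minimal sketch of polling one value from an inverter over Modbus TCP, using the open-source pymodbus library (3.x API). The register address, slave ID, and scale factor are hypothetical; every vendor (or SunSpec model) publishes its own register map.

```python
# Minimal sketch: poll one value from an inverter over Modbus TCP with
# pymodbus (3.x API). Register address, slave ID, and scale factor are
# hypothetical -- every vendor or SunSpec model has its own map.
from pymodbus.client import ModbusTcpClient

client = ModbusTcpClient("192.168.1.50", port=502)  # inverter on the site LAN
client.connect()

# Hypothetical: AC active power in holding register 40083, 16-bit, 0.1 kW units.
result = client.read_holding_registers(40083, count=1, slave=1)
if result.isError():
    print("Read failed:", result)
else:
    power_kw = result.registers[0] * 0.1
    print(f"AC power: {power_kw:.1f} kW")

client.close()
```

Multiply that by every register, every device, and every vendor variant on site, and the scope of "we support all inverter brands" becomes clearer.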
Layer 2: The Edge Gateway
Field devices rarely connect directly to the cloud. Between them sits an edge gateway — an industrial computer or ruggedised server located at the site. Its primary jobs are:
- Protocol translation — converting Modbus, DNP3, or IEC 61850 into a standard format (usually JSON or a binary protocol) that the cloud can ingest
- Data normalisation — aligning different register maps, unit conventions, and timestamp formats from different device vendors
- Local buffering — storing data temporarily when the communication link to the cloud is unavailable
- Edge analytics — running lightweight logic locally (alarming, control decisions, basic anomaly detection) without cloud round-trips
- Security — acting as a controlled boundary between operational technology (OT) and information technology (IT) networks
Common edge platforms include Moxa, Advantech, and Beckhoff hardware, often running Linux with software stacks like Ignition Edge, Node-RED, or custom Python services. Some modern BESS vendors bundle their own gateway hardware.
PM implication: Edge gateway capabilities directly affect two things your users care about: data latency and offline resilience. If your product promises real-time dashboards, you need to know how frequently the gateway polls field devices (polling interval is a configurable parameter, not a free variable — polling too fast can overload older Modbus devices). If your product promises continuous operation during internet outages, you need to know how much local storage the gateway has and how your application handles data backfill when the link recovers. Write these as explicit acceptance criteria, not assumptions.
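As a rough illustration of the store-and-forward pattern, here is a skeletal gateway loop. The poll_devices and publish_to_cloud functions are stand-ins for real protocol drivers and an uplink client; the structure, not the names, is the point.

```python
# Skeletal store-and-forward loop for an edge gateway. poll_devices() and
# publish_to_cloud() are stand-ins for real protocol drivers and an MQTT/HTTP
# uplink; a production gateway would use a disk-backed buffer, not a deque.
import collections
import json
import time
from datetime import datetime, timezone

POLL_INTERVAL_S = 5                         # too fast can overload older Modbus devices
buffer = collections.deque(maxlen=100_000)  # bounded local buffer

def poll_devices() -> dict:
    # Stand-in: would call Modbus/DNP3 drivers and normalise units here.
    return {"site": "site-001", "inverter_1_power_kw": 412.3}

def publish_to_cloud(payload: str) -> bool:
    # Stand-in: would publish over MQTT; returns False when the uplink is down.
    print("uplink:", payload)
    return True

while True:
    sample = poll_devices()
    sample["ts"] = datetime.now(timezone.utc).isoformat()  # timestamp at the edge
    buffer.append(json.dumps(sample))
    # Drain oldest-first so backfill preserves ordering after an outage.
    while buffer and publish_to_cloud(buffer[0]):
        buffer.popleft()
    time.sleep(POLL_INTERVAL_S)
```

Note that the gateway timestamps data at the moment of measurement, not on arrival in the cloud; that distinction is what makes backfill after an outage meaningful.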
Layer 3: The Communication Layer
Data from the edge gateway reaches the cloud via one of several transport options:
- Fibre or fixed Ethernet — highest reliability and bandwidth; usually only available where a site sits close to existing grid or network infrastructure
- 4G/5G cellular — the most common option for remote sites; reliable but with data costs and latency variability
- Satellite (e.g. Starlink) — increasingly viable for very remote sites; higher latency than cellular
- Private LTE or LoRaWAN — used in large sites where many distributed assets need to connect within a site boundary
The dominant messaging protocol sitting on top of these transports is MQTT — a lightweight publish/subscribe protocol designed for unreliable networks and constrained devices. The gateway publishes telemetry to topics on an MQTT broker in the cloud, and services subscribe to consume it. AMQP and Kafka are also used at the cloud ingestion point for higher throughput requirements.
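For a sense of what this looks like in code, here is a minimal sketch of publishing one telemetry message with the paho-mqtt client library. The broker hostname, topic scheme, and payload shape are assumptions; QoS 1 gives at-least-once delivery, which suits lossy cellular links.

```python
# Minimal sketch: publish one telemetry message with paho-mqtt.
# Broker hostname, topic scheme, and payload shape are assumptions.
import json
import paho.mqtt.publish as publish

payload = json.dumps({
    "ts": "2024-05-01T12:00:00Z",
    "device": "inverter-01",
    "metric": "ac_power_kw",
    "value": 412.3,
})

publish.single(
    topic="sites/site-001/inverter-01/telemetry",  # hypothetical topic scheme
    payload=payload,
    qos=1,                          # broker must acknowledge receipt
    hostname="broker.example.com",  # hypothetical cloud broker
)
```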
PM implication: Communication is where latency budgets get spent and where your SLA commitments meet reality. If a customer asks “why is my dashboard 30 seconds behind?” the answer may live in this layer — a cellular link dropping packets, a gateway buffering aggressively, or an MQTT broker under load. As a PM you need to define data latency SLOs (e.g. 95th percentile of telemetry delivered within 10 seconds of measurement) and work with engineering to instrument and monitor them. Without this, “real-time” is a marketing claim, not a product specification.
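Measuring that SLO requires two timestamps per message: one taken at the edge when the value is measured, and one recorded by the backend at ingestion. A sketch of the computation, assuming you can export those pairs from your ingestion logs:

```python
# Sketch: compute a p95 data-latency SLO from (measured_at, arrived_at)
# timestamp pairs. In production these come from ingestion logs or a
# metrics store; the hard-coded samples here are illustrative only.
from datetime import datetime, timezone
from statistics import quantiles

samples = [
    (datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc),
     datetime(2024, 5, 1, 12, 0, 4, tzinfo=timezone.utc)),
    (datetime(2024, 5, 1, 12, 0, 5, tzinfo=timezone.utc),
     datetime(2024, 5, 1, 12, 0, 17, tzinfo=timezone.utc)),
    # ... thousands more pairs in a real measurement window
]

latencies_s = [(arrived - measured).total_seconds() for measured, arrived in samples]
p95 = quantiles(latencies_s, n=100)[94]  # 95th percentile cut point
print(f"p95 telemetry latency: {p95:.1f}s (target: <= 10s)")
```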
Layer 4: The Cloud/Backend Layer
Once data arrives in the cloud, it needs to be stored and processed. This is where the architecture of your software product lives.
Time-series databases are the core of any energy monitoring platform. Unlike relational databases, they are optimised for storing and querying sequences of timestamped values at high ingestion rates. Common choices include:
- InfluxDB — popular in the energy and IIoT space; strong query language (Flux/InfluxQL)
- TimescaleDB — PostgreSQL extension; familiar SQL interface with time-series optimisations
- Apache Cassandra / Amazon Timestream / Azure Data Explorer — for very high scale or cloud-native deployments
Beyond storage, the backend layer typically includes:
- Stream processing — real-time computation on incoming data (e.g. calculating five-minute average power, detecting alarm conditions) using tools like Kafka Streams or Amazon Kinesis
- Batch processing — nightly or hourly jobs that compute aggregations, reports, and ML model inputs (Apache Spark, dbt, or simple scheduled scripts)
- ML inference services — endpoints that serve predictions (SoH forecasts, fault detection, energy arbitrage recommendations) based on models trained on historical data
- Asset registry and configuration store — a relational database that knows which devices are on which site, their rated capacities, their communication settings, and their calibration parameters
PM implication: Your choice of time-series database is a long-term architectural commitment with product consequences. It affects how fast you can build new analytics features, what query patterns are efficient, and what your data retention costs look like at scale. If you are defining requirements for a new monitoring product, push engineering early on the question: what is our data model? A poorly designed schema — for example, storing all telemetry in a single wide table rather than per-metric time-series — will make every analytics feature slower to build and slower to run.
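As an illustration of the narrow, per-metric layout, here is a sketch of a TimescaleDB schema created and queried via psycopg2. Table, column, and metric names are assumptions; time_bucket and create_hypertable are TimescaleDB's documented functions.

```python
# Sketch of a narrow telemetry schema on TimescaleDB via psycopg2: one row
# per (time, device, metric) rather than a single wide table. Table, column,
# and metric names are illustrative.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS telemetry (
    time    TIMESTAMPTZ      NOT NULL,
    device  TEXT             NOT NULL,
    metric  TEXT             NOT NULL,
    value   DOUBLE PRECISION NOT NULL
);
-- Partition on time so writes and time-range queries stay fast.
SELECT create_hypertable('telemetry', 'time', if_not_exists => TRUE);
"""

with psycopg2.connect("dbname=energy user=app") as conn:  # hypothetical DSN
    with conn.cursor() as cur:
        cur.execute(DDL)
        # Query patterns like five-minute averages fall out naturally:
        cur.execute("""
            SELECT time_bucket('5 minutes', time) AS bucket, avg(value)
            FROM telemetry
            WHERE device = %s AND metric = %s
            GROUP BY bucket ORDER BY bucket;
        """, ("inverter-01", "ac_power_kw"))
        print(cur.fetchall())
```

With a wide table, every new metric means a schema migration; with the narrow layout, a new device type is just new rows.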
Layer 5: The Application Layer
This is the layer your users actually touch: the web application, mobile app, API, and any external integrations.
- REST or GraphQL APIs expose backend data to frontend clients and to customer systems (SCADA, ERP, or energy management platforms)
- WebSocket connections support real-time dashboard updates without constant polling
- Dashboards and visualisations show live operational status, historical trends, and performance KPIs
- Alerting and notification engines deliver SMS, email, or webhook alerts when thresholds are breached
- Reporting modules generate periodic summaries for operations teams, asset owners, and grid operators
- Control interfaces — for BESS specifically — allow operators to dispatch charge/discharge commands, set operating modes, or configure protection settings
PM implication: This is the layer where PMs spend most of their time, but the features you define here are constrained by every layer below. A feature like “show me the SoH trend for each battery string over the last 12 months” requires: the BMS to report string-level data (Layer 1), the gateway to poll and forward it (Layer 2), reliable delivery to cloud (Layer 3), a schema that stores string-level granularity (Layer 4), and an API and UI that can query and render it (Layer 5). If any layer drops string-level data or aggregates it away early, the feature is not deliverable regardless of what the UI design shows. Validate data availability end-to-end before committing features to a sprint.
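One practical way to do that validation: before the sprint starts, ask for a coverage query against the telemetry store. The sketch below reuses the hypothetical schema from Layer 4; the metric name and device naming convention are assumptions.

```python
# Data-availability check before committing the "12-month string-level SoH
# trend" feature: does string-level SoH actually exist in the store, and over
# what period? Uses the hypothetical schema sketched in Layer 4.
import psycopg2

QUERY = """
SELECT device,
       min(time) AS first_seen,
       max(time) AS last_seen,
       count(*)  AS samples
FROM telemetry
WHERE metric = 'soh_pct'
  AND device LIKE 'string-%'          -- string-level, not pack-level, rows
  AND time > now() - INTERVAL '12 months'
GROUP BY device
ORDER BY device;
"""

with psycopg2.connect("dbname=energy user=app") as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        for device, first_seen, last_seen, samples in cur.fetchall():
            print(device, first_seen, "->", last_seen, f"({samples} samples)")
```

If the query comes back empty, the conversation belongs with the Layer 1 and Layer 2 owners, not the UI team.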
Putting It Together: Questions Every PM Should Be Able to Answer
Before writing specs or roadmap commitments for an energy IIoT product, make sure you can answer these:
- What protocols does each device type on site use? Do we have drivers for all of them, or does a new device vendor mean a new integration project?
- What is our polling interval, and can it be changed per-site or per-device-type? Who owns that configuration?
- What happens when the site loses internet connectivity? How long can the gateway buffer data? How does our backend handle backfill, and will it affect dashboard continuity?
- What is our end-to-end data latency target, and how do we measure it? Is it monitored in production?
- At what granularity do we store telemetry, and for how long? Is there a tiered retention policy (e.g. raw 1-second data for 30 days, 5-minute averages for 2 years; see the sketch after this list)?
- What data does our ML pipeline need, and is it actually available from the field? Has the model been tested on real site data or only on a lab dataset?
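On the retention question, here is what a tiered policy can look like in practice, sketched against the hypothetical TimescaleDB schema from Layer 4. Continuous aggregates and add_retention_policy are documented TimescaleDB features; the table and view names are assumptions.

```python
# Tiered retention on TimescaleDB: raw rows kept 30 days, 5-minute averages
# kept 2 years. Uses a continuous aggregate plus retention policies; names
# follow the hypothetical telemetry table sketched in Layer 4.
import psycopg2

conn = psycopg2.connect("dbname=energy user=app")
conn.autocommit = True  # continuous aggregates cannot be created in a transaction
cur = conn.cursor()

cur.execute("""
    CREATE MATERIALIZED VIEW telemetry_5m
    WITH (timescaledb.continuous) AS
    SELECT time_bucket('5 minutes', time) AS bucket,
           device, metric, avg(value) AS avg_value
    FROM telemetry
    GROUP BY bucket, device, metric;
""")
cur.execute("SELECT add_retention_policy('telemetry', INTERVAL '30 days');")
cur.execute("SELECT add_retention_policy('telemetry_5m', INTERVAL '2 years');")
```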
These are not engineering questions — they are product questions. The answers determine what you can ship, when you can ship it, and what you can promise a customer.
Final Thoughts
The IIoT stack is not the background infrastructure for your product — it is your product, from end to end. Every layer is a set of decisions made by engineers that either expands or constrains the feature space available to you as a PM.
You do not need to be able to configure a Modbus register map or tune a Kafka consumer group. But you do need to know that these things exist, why they matter, and what questions to ask when a feature feels harder than it looks.
The best PMs in energy software are not those who understand the application layer best. They are the ones who understand the whole stack well enough to know which layer is the real constraint — and to design around it.